Linfeng Li

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Jan 08, 2026
Via arXiv

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Dec 18, 2025
Via arXiv

EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing

Dec 12, 2025
Via arXiv

WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World

Dec 11, 2025
Via arXiv

Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models

May 30, 2025
Via arXiv

DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation

May 29, 2025
Via arXiv

An Empirical Study of GPT-4o Image Generation Capabilities

Apr 08, 2025
Via arXiv

Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models

Nov 14, 2024
Via arXiv

Stable Object Placement Under Geometric Uncertainty via Differentiable Contact Dynamics

Sep 26, 2024
Via arXiv